Smart Paper Notes

A generative recommender system with GMM prior for cancer drug generation and sensitivity prediction

Krzysztof Koras , Marcin Możejko , Paulina Szymczak , Eike Staub , Ewa Szczurek

Categories: Machine Learning | Artificial Intelligence

2022-06-07

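The recent emergence of high-throughput drug screening assays has sparked intensive development of machine learning methods, including models that predict the sensitivity of cancer cell lines to anti-cancer drugs and methods that generate potential drug candidates. However, the idea of generating compounds with specific properties while simultaneously modeling their efficacy has not been comprehensively explored. To address this need, we present Vadeers, a variational autoencoder-based drug efficacy estimation recommender system. Compounds are generated by a novel variational autoencoder with a semi-supervised Gaussian mixture model (GMM) prior. The prior defines a clustering in the latent space, in which the clusters are associated with specific drug properties. In addition, Vadeers is equipped with a cell line autoencoder and a sensitivity prediction network. The model combines data on SMILES string representations of anti-cancer drugs, their inhibition profiles against protein kinases, biological features of cell lines, and measurements of the sensitivity of the cell lines to the drugs. The evaluated Vadeers variants achieve a high Pearson correlation of r = 0.87 between true and predicted drug sensitivity estimates. We train the GMM prior so that the clusters in the latent space correspond to a pre-computed clustering of the drugs by their inhibition profiles. We show that the learned latent representations and newly generated data points accurately reflect the given clustering. In summary, Vadeers offers a comprehensive model of drug and cell line properties and the relationships between them, as well as guided generation of novel compounds.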

translated by Google Translate

Defense Against Adversarial Attacks on Audio DeepFake Detection

Piotr Kawa , Marcin Plata , Piotr Syga

Categories: Machine Learning

2022-12-30

Audio DeepFakes are artificially generated utterances created using deep learning methods, with the main aim of fooling listeners; most such audio is highly convincing. Their quality is sufficient to pose a serious threat to security and privacy, for example by undermining the reliability of news or enabling defamation. To counter these threats, multiple neural network-based methods for detecting generated speech have been proposed. In this work, we cover the topic of adversarial attacks, which decrease the performance of detectors by adding superficial changes (difficult for a human to spot) to the input data. Our contribution consists of evaluating the robustness of 3 detection architectures against adversarial attacks in two scenarios (white-box and using a transferability mechanism) and then enhancing it through adversarial training performed with our novel adaptive training method.
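
As a rough illustration of how a small input change can flip a detector's decision, the sketch below applies the fast gradient sign method (FGSM) to a toy logistic-regression "detector". The abstract does not say which attacks the authors use, so FGSM is a swapped-in standard example; the weights, the three-value input, and the `epsilon` value are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Score of a toy logistic-regression detector (hypothetical model)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, epsilon):
    """Perturb x by epsilon in the direction that increases the loss."""
    p = predict(weights, bias, x)
    # For cross-entropy loss, the gradient with respect to the input is (p - label) * w.
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical detector parameters and input features.
weights, bias = [2.0, -1.0, 0.5], 0.0
x, label = [0.3, 0.1, 0.2], 1

x_adv = fgsm(weights, bias, x, label, epsilon=0.2)
clean_score = predict(weights, bias, x)
adv_score = predict(weights, bias, x_adv)
print(clean_score, adv_score)
```

In this toy case each feature moves by at most 0.2, yet the score crosses the 0.5 decision threshold. Adversarial training, in general terms, adds such perturbed inputs with their correct labels back into the training set.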

Fast-moving object counting with an event camera

Kamil Bialik , Marcin Kowalczyk , Krzysztof Blachut , Tomasz Kryjak

Categories: Computer Vision

2022-12-16

This paper proposes the use of an event camera as a component of a vision system that enables counting of fast-moving objects, in this case falling corn grains. Cameras of this type transmit information about changes in the brightness of individual pixels and are characterised by low latency, no motion blur, correct operation in different lighting conditions, and very low power consumption. The proposed counting algorithm processes events in real time. The operation of the solution was demonstrated on a stand consisting of a chute with a vibrating feeder, which allowed the number of falling grains to be adjusted. The objective of the control system with a PID controller was to maintain a constant average number of falling objects. The proposed solution was subjected to a series of tests to verify the correct operation of the developed method. The results confirm the validity of using an event camera to count small, fast-moving objects and point to a wide range of potential industrial applications.
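
The control loop mentioned above can be sketched as a textbook discrete PID controller that adjusts a feeder drive signal so that the measured grain rate tracks a setpoint. This is a minimal sketch under stated assumptions, not the authors' implementation: the gains, the setpoint of 50 grains per second, and the linear plant model (`rate = 2.0 * drive`) are hypothetical, and the derivative gain is set to zero for simplicity.

```python
class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first step, when there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps=200, dt=0.1):
    """Run the loop against a hypothetical feeder model and return the final rate."""
    pid = PID(kp=0.1, ki=0.5, kd=0.0, setpoint=50.0)  # hypothetical gains
    drive = 0.0
    rate = 0.0
    for _ in range(steps):
        rate = 2.0 * drive  # hypothetical plant: grain rate proportional to drive
        drive = pid.update(rate, dt)
    return rate


final_rate = simulate()
print(final_rate)
```

In a real system the `rate` measurement would come from the event-based counter rather than from a plant model, and the gains would need tuning on the actual feeder.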

Synthetic Image Data for Deep Learning

Jason W. Anderson , Marcin Ziolkowski , Ken Kennedy , Amy W. Apon

Categories: Computer Vision | Machine Learning

2022-12-12

Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification and semantic segmentation models. In this work, we explore how high-quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle. We use this dataset to quantify the effectiveness of synthetic augmentation using U-net and Double-U-net models. We found that, for this domain, synthetic images were an effective means of augmenting limited sets of real training data. We observed that models trained on purely synthetic images had a very low mean prediction IoU on real validation images. We also observed that adding even very small amounts of real images to a synthetic dataset greatly improved accuracy, and that models trained on datasets augmented with synthetic images were more accurate than those trained on real images alone. Finally, we found that in use cases that benefit from incremental training or model specialization, pretraining a base model on synthetic images provided a sizeable reduction in the training cost of transfer learning, allowing up to 90% of the model training to be front-loaded.

Neural Representations Reveal Distinct Modes of Class Fitting in Residual Convolutional Networks

Michał Jamroż , Marcin Kurdziel

Categories: Machine Learning | Computer Vision

2022-12-01

We leverage probabilistic models of neural representations to investigate how residual networks fit classes. To this end, we estimate class-conditional density models for representations learned by deep ResNets. We then use these models to characterize distributions of representations across learned classes. Surprisingly, we find that classes in the investigated models are not fitted in a uniform way. On the contrary, we uncover two groups of classes that are fitted with markedly different distributions of representations. These distinct modes of class-fitting are evident only in the deeper layers of the investigated models, indicating that they are not related to low-level image features. We show that the uncovered structure in neural representations correlates with memorization of training examples and adversarial robustness. Finally, we compare class-conditional distributions of neural representations between memorized and typical examples. This allows us to uncover where in the network structure class labels arise for memorized and standard inputs.

EBHI-Seg: A Novel Enteroscope Biopsy Histopathological Haematoxylin and Eosin Image Dataset for Image Segmentation Tasks

Liyu Shi , Xiaoyan Li , Weiming Hua , Haoyuan Chen , Jing Chen , Zizhen Fan , Minghe Gao , Yujie Jing , Guotao Lu , Deguo Ma

Categories: Computer Vision

2022-12-01

Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers assessment accuracy when computer technology is used to aid diagnosis. Methods: The present study provides a new publicly available Enteroscope Biopsy Histopathological Haematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and extensiveness of EBHI-Seg, experimental results on EBHI-Seg are evaluated using classical machine learning methods and deep learning methods. Results: The experimental results showed that deep learning methods achieved better image segmentation performance on EBHI-Seg. The maximum Dice score for the classical machine learning methods is 0.948, while the Dice score for the deep learning methods is 0.965. Conclusion: This publicly available dataset contains 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can help researchers develop new segmentation algorithms for the medical diagnosis of colorectal cancer, which can be used in the clinical setting to help doctors and patients.

Developing a Knowledge Graph Framework for Pharmacokinetic Natural Product-Drug Interactions

Sanya B. Taneja , Tiffany J. Callahan , Mary F. Paine , Sandra L. Kane-Gill , Halil Kilicoglu , Marcin P. Joachimiak , Richard D. Boyce

Categories: Artificial Intelligence

2022-09-24

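Pharmacokinetic natural product-drug interactions (NPDIs) occur when botanical natural products are co-consumed with pharmaceutical drugs. Understanding the mechanisms of NPDIs is key to preventing adverse events. We constructed a knowledge graph framework, NP-KG, as a step toward the computational discovery of pharmacokinetic NPDIs. NP-KG is a heterogeneous knowledge graph (KG) that combines biomedical ontologies, linked data, and full texts of the scientific literature, constructed with the Phenotype Knowledge Translator framework and the semantic relation extraction systems SemRep and Integrated Network and Dynamic Reasoning Assembler. NP-KG was evaluated with case studies of pharmacokinetic green tea-drug and kratom-drug interactions, using path searches and meta-path discovery to determine congruent and contradictory information compared to ground truth data. The fully integrated NP-KG consists of 745,512 nodes and 7,249,576 edges. The evaluation of NP-KG yielded congruent (38.98% for green tea, 50% for kratom), contradictory (15.25% for green tea, 21.43% for kratom), and both congruent and contradictory (15.25% for green tea, 21.43% for kratom) information. Potential pharmacokinetic mechanisms for several purported NPDIs, including the green tea-raloxifene, green tea-nadolol, kratom-midazolam, kratom-quetiapine, and kratom-venlafaxine interactions, were congruent with the published literature. NP-KG is the first KG to integrate biomedical ontologies with full texts of the scientific literature focused on natural products. We demonstrate the application of NP-KG to identify pharmacokinetic interactions involving enzymes, transporters, and pharmaceutical drugs. We envision that NP-KG will facilitate improved human-machine collaboration to guide researchers in future studies of pharmacokinetic NPDIs. The NP-KG framework is publicly available at https://doi.org/10.5281/zenodo.6814507 and https://github.com/sanyabt/np-kg.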

An Improved Algorithm For Online Reranking

Marcin Bienkowski , Marcin Mucha

Categories: Machine Learning

2022-09-11

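We study a fundamental model of online preference aggregation, in which an algorithm maintains an ordered list of $n$ elements. The input is a stream of preferred sets $R_1, R_2, \dots, R_t, \dots$. Upon seeing $R_t$, and without knowledge of any future sets, the algorithm has to rerank the elements (change the list ordering) so that at least one element of $R_t$ is found near the front of the list. The incurred cost is the sum of the list update costs (the number of swaps of neighboring list elements) and the access costs (the position of the first element of $R_t$ on the list). This scenario occurs naturally in applications such as ordering items in an online shop using the aggregated preferences of shop customers. The theoretical underpinning of this problem is known as Min-Sum Set Cover. Unlike previous work (Fotakis et al., ICALP 2020, NIPS 2020), which mostly studied the performance of an online algorithm ALG against the static optimal solution (a single optimal list ordering), in this paper we study an arguably harder variant in which the benchmark is the provably stronger optimal dynamic solution OPT (which may also modify the list ordering). In terms of an online shop, this means that the aggregated preferences of its user base evolve over time. We construct a computationally efficient randomized algorithm whose competitive ratio (ALG-to-OPT cost ratio) is $O(r^2)$ and prove the existence of a deterministic $O(r^4)$-competitive algorithm. Here, $r$ is the maximum cardinality of the sets $R_t$. This is the first algorithm whose ratio does not depend on $n$: the previously best algorithm for this problem was $O(r^{3/2} \cdot \sqrt{n})$-competitive, and $\Omega(r)$ is a lower bound on the performance of any deterministic online algorithm.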
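
To make the cost model concrete, the following sketch charges the access cost (the position of the first requested element) plus the update cost (the number of adjacent swaps) while serving a stream of request sets. The reranking rule used here is a plain move-to-front heuristic, chosen only for illustration; it is not the randomized or deterministic algorithm from the paper. The function names and the example list are hypothetical.

```python
def serve_request(order, request):
    """Serve one request set with a move-to-front rule.

    Returns (new_order, access_cost, update_cost).
    """
    # Access cost: 1-based position of the first list element that is in the request.
    pos = next(i for i, item in enumerate(order) if item in request)
    access_cost = pos + 1
    # Moving that element to the front takes `pos` swaps of adjacent elements.
    new_order = [order[pos]] + order[:pos] + order[pos + 1:]
    return new_order, access_cost, pos


def total_cost(order, requests):
    """Sum access and update costs over a stream of request sets."""
    order = list(order)
    total = 0
    for request in requests:
        order, access_cost, update_cost = serve_request(order, request)
        total += access_cost + update_cost
    return order, total


final_order, cost = total_cost(["a", "b", "c", "d"], [{"c", "d"}, {"d"}, {"a", "c"}])
print(final_order, cost)
```

For this stream the per-request costs are 3 + 2, 4 + 3, and 2 + 1, for a total of 15. Charging the access cost before the move is a simplification; for move-to-front, moving first and then accessing at position 1 yields the same total per request.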

Segmentation of Weakly Visible Environmental Microorganism Images Using Pair-wise Deep Learning Features

Frank Kulwa , Chen Li , Marcin Grzegorzek , Md Mamunur Rahaman , Kimiaki Shirahama , Sergey Kosov

Categories: Computer Vision

2022-08-31

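The use of environmental microorganisms (EMs) offers a highly efficient, low-cost, and harmless remedy for environmental pollution through the monitoring and decomposition of pollutants. This relies on EMs being correctly segmented and identified. To enhance the segmentation of weakly visible EM images, which are transparent, noisy, and low in contrast, a Pairwise Deep Learning Feature Network (PDLF-Net) is proposed in this study. The use of pairwise deep learning features (PDLFs) enables the network to focus more on the foreground (EMs) by concatenating the pairwise deep learning features of each image to different blocks of the base model SegNet. Leveraging the Shi and Tomasi descriptors, we extract deep features of each image from patches centered at each descriptor, using the VGG-16 model. Then, to learn the intermediate characteristics between the descriptors, the features are paired based on the Delaunay triangulation theorem to form pairwise deep learning features. In this experiment, PDLF-Net achieves outstanding segmentation results of 89.24%, 63.20%, 77.27%, 35.15%, 89.72%, 91.44%, and 89.30% on accuracy, IoU, Dice, VOE, sensitivity, precision, and specificity, respectively.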
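
The segmentation metrics named above can all be derived from pixel-level counts of true and false positives and negatives. The sketch below shows one common set of definitions on flat binary masks; it assumes VOE is defined as 1 − IoU and that both masks contain at least one foreground and one background pixel (otherwise some ratios divide by zero). The example masks are hypothetical, and the paper may average the metrics differently, for instance per image.

```python
def segmentation_metrics(pred, truth):
    """Compute overlap metrics for two flat binary masks of equal length."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p and t)          # foreground predicted correctly
    fp = sum(1 for p, t in pairs if p and not t)      # background predicted as foreground
    fn = sum(1 for p, t in pairs if not p and t)      # foreground missed
    tn = sum(1 for p, t in pairs if not p and not t)  # background predicted correctly
    iou = tp / (tp + fp + fn)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "iou": iou,
        "dice": 2 * tp / (2 * tp + fp + fn),
        "voe": 1 - iou,  # assumed definition: volumetric overlap error = 1 - IoU
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
    }


# Hypothetical 8-pixel masks (1 = microorganism, 0 = background).
pred = [1, 1, 1, 0, 0, 0, 1, 0]
truth = [1, 1, 0, 0, 0, 0, 1, 1]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

Under these definitions Dice and IoU are tied by Dice = 2·IoU / (1 + IoU) for a single pair of masks, which is a quick sanity check when reading reported results.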

Chosen methods of improving object recognition of small objects with weak recognizable features

Magdalena Stachoń , Marcin Pietroń

Categories: Computer Vision | Artificial Intelligence

2022-08-29

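Many object detection models struggle with several problematic aspects of small object detection, including a low number of samples, a lack of diversity, and weak feature representation. GANs belong to the class of generative models, whose original objective is to learn to mimic any data distribution. Using a suitable GAN model makes it possible to augment low-precision data, increasing its amount and diversity. This could potentially lead to improved object detection results. Additionally, incorporating a GAN-based architecture into a deep learning model can increase the accuracy of small object recognition. In this work, a GAN-based method is presented to improve small object detection on the VOC Pascal dataset. The method is compared with different popular augmentation strategies, such as object rotations and shifts. The experiments are based on the FasterRCNN model.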
